Normalization of ASR confidence classifier scores via confidence mapping
Authors
Abstract
A speech recognition confidence classifier (CC) score quantitatively represents the correctness of a decoded utterance in the [0,1] range. An operating threshold is associated with the classifier, and recognitions with scores greater than the threshold are accepted. Speech developers may set their own thresholds, but an acoustic model (AM) or CC update often alters the correct-accept (CA) vs. false-accept (FA) profile, necessitating threshold reselection. This is a particular problem when (a) the threshold is hardcoded in shipped hardware or software, (b) developers lack the expertise for threshold tuning, or (c) tuning is not cost-effective and may need to be done often. To our knowledge, our work is the first to present this practical and interesting problem of avoiding threshold reselection, and it proposes novel confidence-mapping-based techniques to improve or retain both CA and FA at previously set thresholds. We propose and evaluate (a) histogram-based mapping, (b) polynomial-fitting, and (c) tanh-fitting methods to map confidences associated with false recognitions, and we discuss their issues and benefits. In our tests, all of these mapping methods turn a mean CA regression of 21% into a gain of 1-2%, with tanh-mapping providing the best CA and FA tradeoff.
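To make the idea of confidence mapping concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple quantile-matching map as a stand-in for the histogram-based method named in the abstract: scores from an updated model are remapped so that their distribution on false recognitions resembles that of the previous model, which keeps the false-accept rate at a previously set threshold roughly unchanged. The function names (`accept_rate`, `fit_quantile_map`, `apply_map`), the number of quantile points, and the synthetic beta-distributed scores are all hypothetical choices for illustration.

```python
import numpy as np


def accept_rate(scores, threshold):
    """Fraction of utterances whose confidence score exceeds the threshold."""
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(scores > threshold))


def fit_quantile_map(new_scores, old_scores, n_points=101):
    """Fit a monotone piecewise-linear map from new-model scores to old-model scores.

    Assumption: matching quantiles of the two score samples is used here as a
    simple stand-in for a histogram-based mapping.
    """
    q = np.linspace(0.0, 1.0, n_points)
    src = np.quantile(np.asarray(new_scores, dtype=float), q)
    dst = np.quantile(np.asarray(old_scores, dtype=float), q)
    # Pin the endpoints so the map is defined on the whole [0, 1] range.
    src = np.concatenate(([0.0], src, [1.0]))
    dst = np.concatenate(([0.0], dst, [1.0]))
    # np.interp expects increasing x-coordinates, so drop duplicate knots.
    src, idx = np.unique(src, return_index=True)
    return src, dst[idx]


def apply_map(scores, src, dst):
    """Map raw new-model scores through the fitted piecewise-linear function."""
    return np.interp(np.asarray(scores, dtype=float), src, dst)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic scores of false recognitions (hypothetical data):
    old_false = rng.beta(2, 5, size=5000)  # previous model
    new_false = rng.beta(3, 4, size=5000)  # updated model, scores shifted upward
    threshold = 0.5                        # previously set operating threshold

    src, dst = fit_quantile_map(new_false, old_false)
    mapped = apply_map(new_false, src, dst)

    print("FA rate, old model:      ", accept_rate(old_false, threshold))
    print("FA rate, new model (raw):", accept_rate(new_false, threshold))
    print("FA rate, new model (map):", accept_rate(mapped, threshold))
```

In this sketch the map is fitted on false recognitions only, following the abstract's description; how such a map affects CA depends on how the scores of correct recognitions shift, which this example does not model.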
Similar resources
Improved text overlay detection in videos using a fusion-based classifier
In this paper, classifier fusion is adopted to demonstrate improved performance for our text overlay detections in the NIST TREC-2002 Video Retrieval Benchmark. A normalized ensemble fusion is explored to combine two text overlay detection models. The fusion incorporates normalization of confidence scores, aggregation via a combiner function, and an optimized selection. The proposed fusion classif...
Improving the performance of a telephone keyword spotting system using confidence score normalization based on linear programming
Conventional word spotting systems determine hypothesized keywords and their confidence scores using a speech recognizer. These keywords are accepted or rejected by comparing their scores with a specific threshold. It has been shown that the confidence score produced by the recognizer is highly dependent on the sub-word structure of each keyword. So comparing assigned scores to keyw...
Improving performance of an HMM-based ASR system by using monophone-level normalized confidence measure
In this paper, we propose a novel confidence scoring method that is applied to N-best hypotheses output from an HMM-based classifier. In the first pass of the proposed method, the HMM-based classifier with monophone models outputs N-best hypotheses (word candidates) and boundaries of all the monophones in the hypotheses. In the second pass, an SM (Sub-space Method)-based verifier tests the hypo...
Confidence Evaluation for Combining Diverse Classifiers
For combining classifiers at measurement level, the diverse outputs of classifiers should be transformed to uniform measures that represent the confidence of decision, hopefully, the class probability or likelihood. This paper presents our experimental results of classifier combination using confidence evaluation. We test three types of confidences: log-likelihood, exponential and sigmoid. For ...
Verification of unemployment benefits’ claims using Classifier Combination method
Unemployment insurance is one of the most popular insurance types in the modern world. The Social Security Organization is responsible for checking the unemployment benefits of individuals supported by unemployment insurance. Manual evaluation of unemployment claims requires a great deal of time and money. Data mining and machine learning as two efficient tools for data analysis can assist ...